22 research outputs found

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.

    SCADA and related technologies

    Presented at SCADA and related technologies for irrigation district modernization: a USCID water management conference on October 26-29, 2005 in Vancouver, Washington. Section 203(a) of the Central Utah Project Completion Act authorized a replacement project, the Uinta Basin Replacement Project (UBRP), to replace the Uinta and Upalco Units of the Central Utah Project (CUP), which were not constructed. The UBRP will provide: 2,000 acre-feet of irrigation water; 3,000 acre-feet of municipal and industrial water; reduced wilderness impacts; increased instream flows; and improved recreation. On the Lake Fork River, UBRP must be integrated into a complex water environment. The SCADA information generated by UBRP will play a key role in reducing uncertainty for water users, which is expected to have an economic impact. Construction delays in enlarging Big Sand Wash Dam and Reservoir eliminated the ability of the Moon Lake Water Users Association (Association) to store any substantial amount of water behind the old dam during the 2005 irrigation season. In response to this crisis, the partners in the project expanded the planned installation of SCADA monitoring and automation at key sites and required the installation to be completed over a period of weeks instead of years. The objective was to mitigate the effect of the lost storage by increasing flexibility and fine-tuning operations. The effort was largely successful.

    Global optimisation applied to molecular architecture

    This thesis addresses the problem of identifying configurations of molecular structures which correspond to the globally minimum potential energy for that structure. Molecular structures arise as a result of non-bonded and bonded atomic interactions, and experimental evidence shows that, in the great majority of cases, the potential energy global minimum corresponds to the most stable configuration of the molecular structure. This configuration is of particular importance as it dictates most of the physical properties of the molecular structure. The potential energy of a molecular structure may be calculated, as a function of the atomic positions, using appropriate molecular models. However, as these give rise to potential energy functions that are typically non-convex with many local minima, finding the global minimum is an extremely difficult problem. For many years this problem has been investigated by chemists and physicists; however, in more recent years, researchers from optimisation and computer science have also become involved and, in fact, the minimisation of non-convex potential energy functions arising from molecular conformation or protein folding problems has become one of the most important interdisciplinary problems [43]. This thesis develops and analyses a molecular structure global optimisation method using both deterministic local and stochastic global optimisation techniques within a genetic algorithm based environment. By incorporating different genetic operators, the one basic method was able to globally optimise a number of different types of molecular structures.
    From an experimental point of view, the method was particularly successful and found: all currently accepted global minima for scaled Lennard-Jones atomic clusters of 2 to 80 atoms; two new global minima for 77 and 78 atom scaled Lennard-Jones atomic clusters; all currently accepted and some improved global minima for mixed argon-xenon atomic clusters of 7, 13 and 19 atoms (in addition, minima were determined for all remaining clusters in the 2, ..., 20 atom range); all currently accepted global minima for clusters of benzene molecules of 2 to 6 molecules and new minima for clusters of 8 to 12 molecules; all currently accepted global minima for a two-dimensional model molecular structure where the number of atoms ranged from 3 to 42; and currently accepted global minima for a number of small molecules. Of particular importance is that, in determining these global minima, the method always started from randomly generated initial configurations and at no stage used any heuristic information to accelerate the search. From a theoretical point of view, this thesis presents an analytical comparison of the phenotype crossover operators used in the method with the more standard (genotype) crossover operators normally used in genetic algorithms. This analysis is confirmed with experimental results. In addition, a proof of convergence for the stochastic global optimisation technique used within the genetic algorithm environment and an analytical evaluation of all potential energy gradients required by the deterministic local optimiser are presented.
    Chapter 1 of this thesis describes the molecular architecture problem and presents a review of local and global optimisation techniques. Chapter 2 describes the development of APSE, the stochastic global optimisation technique used in this study, while the results obtained by applying APSE to the pure atomic cluster problem are presented in Chapter 3. Chapter 4 describes the development of GEM*, the major computational method used in this study. GEM* implements a combination of local optimisation and APSE probabilistic searches within a genetic algorithm based environment. The results obtained by applying GEM* to the pure atomic cluster problem and a theoretical comparison of phenotype genetic crossover operators with more standard genetic crossover operators are presented in Chapter 5. The results obtained by applying GEM* to mixed argon-xenon atomic cluster problems are described in Chapter 6, while the optimisation of clusters of benzene and water molecules by GEM* is discussed in Chapter 7. Chapter 8 describes the GEM* optimisation results obtained for a model molecular structure and Chapter 9 presents the GEM* optimisation results for a number of small molecules. A summary and future research directions are presented in Chapter 10, while the appendices contain the analytical derivation of the potential energy gradients required for the implementation of the BFGS local optimiser and tables describing the structures obtained for mixed atomic clusters.
    Within this thesis, Chapter 2 and Sections 3.3.1 and 3.4.1 appeared in the Australian Computer Journal, Vol. 28, No. 4, November 1996. Chapter 6 has been accepted for publication by the Journal of Computational Chemistry. Sections 4.2, 5.2, 5.3 and 5.4 have been submitted to the Journal of Global Optimization. Section 3.3.2 appeared as Technical Report 95 - 010, Department of Mathematics and Computing, Central Queensland University. Chapter 7 appeared as Technical Report 96 - 005, Department of Mathematics and Computing, Central Queensland University. Chapters 8 and 9 appeared as Technical Report 96 - 006, Department of Mathematics and Computing, Central Queensland University.
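    To make concrete what "potential energy as a function of the atomic positions" means for the atomic cluster case, here is a minimal sketch of a pairwise energy function. It assumes the common scaled Lennard-Jones pair potential v(r) = r^-12 - 2 r^-6 (unit well depth, minimum at r = 1); the abstract does not state the exact form used in the thesis, and the function name `scaled_lj_energy` is hypothetical.

```python
import itertools
import math


def scaled_lj_energy(positions):
    """Total pairwise energy of a cluster of atoms.

    Assumption: the scaled Lennard-Jones pair potential
    v(r) = r**-12 - 2 * r**-6, whose minimum value is -1 at r = 1.
    `positions` is a sequence of coordinate tuples of equal dimension.
    """
    energy = 0.0
    for a, b in itertools.combinations(positions, 2):
        r = math.dist(a, b)      # Euclidean distance between the two atoms
        inv6 = r ** -6
        energy += inv6 * inv6 - 2.0 * inv6
    return energy
```

    Under this assumed form, two atoms at unit separation give an energy of -1. The function is non-convex in the atomic coordinates, which is why a local optimiser alone is not enough to find the global minimum.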

    Cooperating local search for the maximum clique problem


    Dynamic local search for the maximum clique problem

    In this paper, we introduce DLS-MC, a new stochastic local search algorithm for the maximum clique problem. DLS-MC alternates between phases of iterative improvement, during which suitable vertices are added to the current clique, and plateau search, during which vertices of the current clique are swapped with vertices not contained in the current clique. The selection of vertices is solely based on vertex penalties that are dynamically adjusted during the search, and a perturbation mechanism is used to overcome search stagnation. The behaviour of DLS-MC is controlled by a single parameter, penalty delay, which controls the frequency at which vertex penalties are reduced. We show empirically that DLS-MC achieves substantial performance improvements over state-of-the-art algorithms for the maximum clique problem over a large range of the commonly used DIMACS benchmark instances.
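    The alternation described in this abstract can be illustrated with a simplified sketch. It is not the authors' implementation: the function name `dls_mc_sketch`, the step budget, the rule that a swapped-out vertex is barred for the rest of the plateau phase, and the particular perturbation (add a random vertex and drop non-neighbours) are all assumptions made for illustration.

```python
import random


def dls_mc_sketch(adj, penalty_delay=2, max_steps=10_000, seed=0):
    """Simplified dynamic local search for maximum clique (illustrative only).

    adj maps each vertex to the set of its neighbours (no self-loops).
    """
    rng = random.Random(seed)
    vertices = list(adj)
    penalty = {v: 0 for v in vertices}
    clique = {rng.choice(vertices)}
    best = set(clique)
    steps = cycles = 0

    def pick(candidates):
        # Selection uses penalties only; ties are broken at random.
        low = min(penalty[v] for v in candidates)
        return rng.choice([v for v in candidates if penalty[v] == low])

    def addable():
        # Vertices adjacent to every member of the current clique.
        return [v for v in vertices if v not in clique and clique <= adj[v]]

    while steps < max_steps:
        # Iterative improvement: grow the clique while possible.
        grow = addable()
        while grow:
            clique.add(pick(grow))
            steps += 1
            grow = addable()
        if len(clique) > len(best):
            best = set(clique)

        # Plateau search: swap in vertices that miss exactly one clique member.
        start, barred, improved = set(clique), set(), False
        while steps < max_steps and clique & start:
            swaps = [v for v in vertices
                     if v not in clique and v not in barred
                     and len(clique - adj[v]) == 1]
            if not swaps:
                break
            v = pick(swaps)
            (out,) = clique - adj[v]
            clique.remove(out)
            clique.add(v)
            barred.add(out)
            steps += 1
            if addable():
                improved = True
                break
        if improved:
            continue

        # Penalty update: penalise the stagnated clique, decay every few cycles.
        for v in clique:
            penalty[v] += 1
        cycles += 1
        if cycles % penalty_delay == 0:
            for v in vertices:
                if penalty[v] > 0:
                    penalty[v] -= 1

        # Perturbation (assumed form): add a random vertex, drop non-neighbours.
        v = rng.choice(vertices)
        clique = {u for u in clique if u in adj[v]} | {v}
        steps += 1
    return best
```

    The single `penalty_delay` argument plays the role of the parameter named in the abstract: larger values let penalties accumulate for longer before they are reduced.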

    Towards fewer parameters for SAT clause weighting algorithms

    Considerable progress has recently been made in using clause weighting algorithms such as DLM and SDF to solve SAT benchmark problems. While these algorithms have outperformed earlier stochastic techniques on many larger problems, this improvement has been bought at the cost of extra parameters and the complexity of fine tuning these parameters to obtain optimal run-time performance. This paper examines the use of parameters, specifically in relation to DLM, to identify underlying features in clause weighting that can be used to eliminate or predict workable parameter settings. To this end we propose and empirically evaluate a simplified clause weighting algorithm that replaces the tabu list and flat moves parameter used in DLM. From this we show that our simplified clause weighting algorithm is competitive with DLM on the four categories of SAT problem for which DLM has already been optimised.
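    For readers unfamiliar with clause weighting, the following sketch shows the general idea in its most basic form: greedily flip the variable that most reduces the total weight of false clauses, and when no flip helps, increase the weights of the currently false clauses. This is a generic scheme written for illustration; it is neither DLM nor the simplified algorithm proposed in the paper, and the function name and parameters are hypothetical.

```python
import random


def clause_weighting_sat(clauses, n_vars, max_flips=100_000, seed=0):
    """Generic clause-weighting local search; returns a model or None.

    clauses: list of lists of non-zero ints (DIMACS-style literals).
    The returned model is a list of booleans for variables 1..n_vars.
    """
    rng = random.Random(seed)
    assign = [False] + [rng.random() < 0.5 for _ in range(n_vars)]  # index 0 unused
    weight = [1] * len(clauses)

    def satisfied(clause):
        return any(assign[abs(lit)] == (lit > 0) for lit in clause)

    def cost():
        # Total weight of the clauses that are currently false.
        return sum(w for w, c in zip(weight, clauses) if not satisfied(c))

    for _ in range(max_flips):
        false_clauses = [i for i, c in enumerate(clauses) if not satisfied(c)]
        if not false_clauses:
            return assign[1:]
        best_var, best_cost = None, cost()
        # Only variables occurring in false clauses can repair them.
        candidates = {abs(lit) for i in false_clauses for lit in clauses[i]}
        for var in candidates:
            assign[var] = not assign[var]      # try the flip
            trial = cost()
            assign[var] = not assign[var]      # undo it
            if trial < best_cost:
                best_var, best_cost = var, trial
        if best_var is None:
            # Local minimum: make the false clauses more expensive.
            for i in false_clauses:
                weight[i] += 1
        else:
            assign[best_var] = not assign[best_var]
    return None
```

    This version has no tabu list and no flat-move setting at all, so it only hints at why reducing such parameters is attractive; it makes no claim about matching the performance reported in the paper.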

    A Memetic Genetic Algorithm for the Vertex p


    An Investigation of Variable Relationships in 3-SAT Problems

    To date, several types of structure for finite Constraint Satisfaction Problems have been investigated with the goal of either improving the performance of problem solvers or allowing efficient problem solvers to be identified. Our aim is to extend the work in this area by performing a structural analysis in terms of variable connectivity for 3-SAT problems. Initially, structure is defined in terms of the compactness of variable connectivity for a problem. Using an easily calculable statistic developed to measure this compactness, a test was then created for identifying 3-SAT problems as either compact, loose or unstructured (or uniform). A problem generator was constructed for generating 3-SAT problems with varying degrees of structure. Using problems from this problem generator and existing problems from SATLIB, we investigated the effects of this type of structure on satisfiability and solvability of 3-SAT problems. For the same problem length, it is demonstrated that satisfiability and solvability are different for structured and uniform problems generated by the problem generator.